Skip to content

crypto.kt{128,256} cleanups & revert workarounds#25990

Closed
jedisct1 wants to merge 2 commits into
ziglang:masterfrom
jedisct1:ktpar2
Closed

crypto.kt{128,256} cleanups & revert workarounds#25990
jedisct1 wants to merge 2 commits into
ziglang:masterfrom
jedisct1:ktpar2

Conversation

@jedisct1

Copy link
Copy Markdown
Contributor

Use atomic writes for CVs to avoid inconsistencies when hashing the leaves on platforms with weak memory ordering.

Also do not rely on the cpu count. Instead, set a maximum number of chunks per thread. As a bonus, we get better cache locality.

Fixes #25985

@jacobly0

jacobly0 commented Nov 20, 2025

Copy link
Copy Markdown
Member

I'm pretty sure I see the io impl bug that could be causing this, can you please revert any workarounds you have attempted for this bug so that I can actually track it down, don't want to just revert the commits in case they contain unrelated cleanups that you do want to keep. Also, you may have to either not use io or keep these tests disabled until it is functional.

@jacobly0

jacobly0 commented Nov 20, 2025

Copy link
Copy Markdown
Member

Hmm, my idea was wrong, but for the record any stores inside a groupAsync function are intended to be visible to any loads after calling groupWait on that group, so this would have just been working around and possibly hiding some other bug.

@jedisct1 jedisct1 changed the title crypto.kt{128,256}: make parallel processing more reliable crypto.kt{128,256} cleanups & revert workarounds Nov 20, 2025
@jedisct1

Copy link
Copy Markdown
Contributor Author

any stores inside a groupAsync function are intended to be visible to any loads after calling groupWait on that group

Thanks for clarifying, Jacob.

Here, each thread accesses a distinct memory region and fills its own chaining value slot, and the order in which threads run or finish doesn't matter. Combined with the fact that this behavior only appears on architectures with weak memory ordering, and that even there it happens nondeterministically, is why an ordering issue is the explanation that made sense to me.

That was an opportunity to clean up the code, but if the symptoms persist, instead of disabling tests, we should probably temporarily remove anything that uses threads in std.crypto to be safe.

@jacobly0

Copy link
Copy Markdown
Member

The mips one was in qemu on a strongly-ordered system, so no memory accesses were actually weak, and also the test itself is already non-deterministic, so that property of the failure can not be used to make other assumptions.

Also do not rely on the cpu count. Instead, set a maximum number
of chunks per thread. As a bonus, we get better cache locality.
@jedisct1 jedisct1 closed this Nov 22, 2025
@jedisct1 jedisct1 deleted the ktpar2 branch November 24, 2025 09:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Flaky crypto.kangarootwelve.test.KT256 sequential and parallel produce same output for large inputs failure on LoongArch CI

2 participants